Dataset statistics

Number of variables: 18
Number of observations: 2874
Missing cells: 583
Missing cells (%): 1.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 15.9 MiB
Average record size in memory: 5.7 KiB
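
The two derived figures in this overview can be checked from the raw counts. The sketch below is a minimal check, assuming the missing-cell percentage is the number of missing cells divided by variables times observations, and that the average record size is the total memory divided by the number of observations. The variable names are illustrative only.

```python
# Raw counts copied from the dataset statistics.
n_variables = 18
n_observations = 2874
missing_cells = 583
total_memory_mib = 15.9

# Share of missing cells over the whole table (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average memory per record, converted from MiB to KiB.
avg_record_kib = total_memory_mib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_kib:.1f} KiB")
```

Both results round to the values shown in the report (1.1% and 5.7 KiB).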

Variable types

Categorical: 6
Numeric: 12

Alerts

title has a high cardinality: 1255 distinct values High cardinality
id_thread has a high cardinality: 2865 distinct values High cardinality
tokens has a high cardinality: 2454 distinct values High cardinality
interlocutors has a high cardinality: 1766 distinct values High cardinality
dates has a high cardinality: 2170 distinct values High cardinality
n_posts is highly correlated with n_interlocutors and 7 other fields High correlation
n_interlocutors is highly correlated with n_posts and 7 other fields High correlation
mean_post_per_interlocutor is highly correlated with n_posts and 6 other fields High correlation
mean_post_per_interlocutor_with_anonymous is highly correlated with n_posts and 6 other fields High correlation
max_post_per_interlocutor_with_anonymous is highly correlated with n_posts and 6 other fields High correlation
n_tokens is highly correlated with n_posts and 8 other fields High correlation
mean_tokens is highly correlated with n_tokens and 4 other fields High correlation
min_tokens is highly correlated with n_posts and 3 other fields High correlation
max_tokens is highly correlated with n_posts and 8 other fields High correlation
n_tokens_stopwords is highly correlated with n_posts and 8 other fields High correlation
mean_tokens_stopwords is highly correlated with n_tokens and 4 other fields High correlation
n_posts is highly correlated with n_interlocutors and 5 other fields High correlation
n_interlocutors is highly correlated with n_posts and 5 other fields High correlation
n_anonymes is highly correlated with n_posts and 4 other fields High correlation
mean_post_per_interlocutor is highly correlated with n_tokens_stopwords High correlation
mean_post_per_interlocutor_with_anonymous is highly correlated with n_posts and 4 other fields High correlation
max_post_per_interlocutor_with_anonymous is highly correlated with n_posts and 4 other fields High correlation
n_tokens is highly correlated with n_posts and 6 other fields High correlation
mean_tokens is highly correlated with min_tokens and 2 other fields High correlation
min_tokens is highly correlated with mean_tokens and 1 other field High correlation
max_tokens is highly correlated with n_tokens and 3 other fields High correlation
n_tokens_stopwords is highly correlated with n_posts and 4 other fields High correlation
mean_tokens_stopwords is highly correlated with mean_tokens and 2 other fields High correlation
n_posts is highly correlated with n_interlocutors and 5 other fields High correlation
n_interlocutors is highly correlated with n_posts and 5 other fields High correlation
mean_post_per_interlocutor is highly correlated with n_posts and 5 other fields High correlation
mean_post_per_interlocutor_with_anonymous is highly correlated with n_posts and 5 other fields High correlation
max_post_per_interlocutor_with_anonymous is highly correlated with n_posts and 5 other fields High correlation
n_tokens is highly correlated with n_posts and 8 other fields High correlation
mean_tokens is highly correlated with n_tokens and 3 other fields High correlation
max_tokens is highly correlated with n_tokens and 3 other fields High correlation
n_tokens_stopwords is highly correlated with n_posts and 8 other fields High correlation
mean_tokens_stopwords is highly correlated with n_tokens and 3 other fields High correlation
n_posts is highly correlated with n_interlocutors and 5 other fields High correlation
n_interlocutors is highly correlated with n_posts and 5 other fields High correlation
n_anonymes is highly correlated with n_posts and 5 other fields High correlation
mean_post_per_interlocutor is highly correlated with n_tokens and 1 other field High correlation
mean_post_per_interlocutor_with_anonymous is highly correlated with n_posts and 5 other fields High correlation
max_post_per_interlocutor_with_anonymous is highly correlated with n_posts and 5 other fields High correlation
n_tokens is highly correlated with n_posts and 7 other fields High correlation
mean_tokens is highly correlated with min_tokens and 2 other fields High correlation
min_tokens is highly correlated with mean_tokens and 2 other fields High correlation
max_tokens is highly correlated with n_tokens and 4 other fields High correlation
n_tokens_stopwords is highly correlated with n_posts and 7 other fields High correlation
mean_tokens_stopwords is highly correlated with mean_tokens and 2 other fields High correlation
title has 583 (20.3%) missing values Missing
n_posts is highly skewed (γ1 = 21.42687354) Skewed
n_interlocutors is highly skewed (γ1 = 21.42687354) Skewed
n_anonymes is highly skewed (γ1 = 40.68570103) Skewed
mean_post_per_interlocutor_with_anonymous is highly skewed (γ1 = 48.14186938) Skewed
max_post_per_interlocutor_with_anonymous is highly skewed (γ1 = 39.66133588) Skewed
id_thread is uniformly distributed Uniform
n_anonymes has 1166 (40.6%) zeros Zeros
mean_post_per_interlocutor has 1007 (35.0%) zeros Zeros
mean_post_per_interlocutor_with_anonymous has 86 (3.0%) zeros Zeros
n_tokens has 62 (2.2%) zeros Zeros
mean_tokens has 62 (2.2%) zeros Zeros
min_tokens has 358 (12.5%) zeros Zeros
max_tokens has 62 (2.2%) zeros Zeros
n_tokens_stopwords has 59 (2.1%) zeros Zeros
mean_tokens_stopwords has 59 (2.1%) zeros Zeros
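
The γ1 values in the Skewed alerts are skewness coefficients. The sketch below shows one common way to compute such a coefficient, assuming the bias-adjusted Fisher-Pearson estimator (the estimator that pandas' `Series.skew` is documented to use). Whether the report uses exactly this estimator is an assumption. The function name `sample_skewness` and the toy data are illustrative and are not taken from the dataset.

```python
import math

def sample_skewness(values):
    """Bias-adjusted Fisher-Pearson skewness of a sequence of numbers."""
    n = len(values)
    if n < 3:
        raise ValueError("skewness needs at least 3 observations")
    mean = sum(values) / n
    # Second and third central moments (population form).
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    if m2 == 0:
        return 0.0  # constant data: treated as not skewed in this sketch
    g1 = m3 / m2 ** 1.5
    # Small-sample correction factor.
    return math.sqrt(n * (n - 1)) / (n - 2) * g1

# Toy data: mostly small counts plus one very large value (a long right tail).
toy_counts = [1] * 20 + [2] * 5 + [4, 15, 437]
print(sample_skewness(toy_counts))       # strongly positive
print(sample_skewness([1, 2, 3, 4, 5]))  # symmetric data
```

A single extreme value, such as the 437 maximum reported for several count variables, is enough to push the coefficient far above zero.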

Reproduction

Analysis started: 2021-11-18 15:53:33.289417
Analysis finished: 2021-11-18 15:55:14.399389
Duration: 1 minute and 41.11 seconds
Software version: pandas-profiling v3.1.0
Download configuration: config.json

Variables

title
Categorical

HIGH CARDINALITY
MISSING

Distinct: 1255
Distinct (%): 54.8%
Missing: 583
Missing (%): 20.3%
Memory size: 209.2 KiB
Discussions: 186
Avis: 163
Supprimer: 153
Avis non décomptés: 131
Conserver: 90
Other values (1250): 1568

Length

Max length: 147
Median length: 17
Mean length: 19.95024007
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1189
Unique (%): 51.9%

Sample

1st row: Bandeaux à foison
2nd row: Ton de l'article
3rd row: Proposition de fusion entre [[Industrie de la houille blanche]] et [[Houille blanche]]
4th row: mon article industries de la houille blanche en Maurienne
5th row: articles enrichis

Common Values

Value | Count | Frequency (%)
Discussions | 186 | 6.5%
Avis | 163 | 5.7%
Supprimer | 153 | 5.3%
Avis non décomptés | 131 | 4.6%
Conserver | 90 | 3.1%
Fichier proposé à la suppression sur Commons | 65 | 2.3%
Liens externes modifiés | 51 | 1.8%
Votes | 25 | 0.9%
Neutre | 18 | 0.6%
Avis divers non décomptés | 16 | 0.6%
Other values (1245) | 1393 | 48.5%
(Missing) | 583 | 20.3%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
avis | 315 | 4.8%
de | 300 | 4.6%
discussions | 193 | 3.0%
(blank) | 186 | 2.9%
la | 166 | 2.6%
non | 159 | 2.4%
supprimer | 156 | 2.4%
décomptés | 147 | 2.3%
à | 127 | 2.0%
et | 104 | 1.6%
Other values (2271) | 4646 | 71.5%

id_thread
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 2865
Distinct (%): 99.7%
Missing: 0
Missing (%): 0.0%
Memory size: 185.4 KiB
2927500_2: 2
58110_4: 2
2141430_2: 2
413290_23: 2
3094690_2: 2
Other values (2860): 2864

Length

Max length: 11
Median length: 9
Mean length: 9.011134308
Min length: 7

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0

Unique

Unique: 2856
Unique (%): 99.4%

Sample

1st row: 11324890_1
2nd row: 11324890_2
3rd row: 11324890_3
4th row: 11324890_4
5th row: 11324890_5

Common Values

Value | Count | Frequency (%)
2927500_2 | 2 | 0.1%
58110_4 | 2 | 0.1%
2141430_2 | 2 | 0.1%
413290_23 | 2 | 0.1%
3094690_2 | 2 | 0.1%
1309500_2 | 2 | 0.1%
1881180_2 | 2 | 0.1%
6329110_4 | 2 | 0.1%
1277240_2 | 2 | 0.1%
10482000_2 | 1 | < 0.1%
Other values (2855) | 2855 | 99.3%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
2927500_2 | 2 | 0.1%
2141430_2 | 2 | 0.1%
413290_23 | 2 | 0.1%
3094690_2 | 2 | 0.1%
1309500_2 | 2 | 0.1%
1881180_2 | 2 | 0.1%
6329110_4 | 2 | 0.1%
1277240_2 | 2 | 0.1%
58110_4 | 2 | 0.1%
1166590_1 | 1 | < 0.1%
Other values (2855) | 2855 | 99.3%

n_posts
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
SKEWED

Distinct: 60
Distinct (%): 2.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.453723034
Minimum: 1
Maximum: 437
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 22.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 4
95-th percentile: 15
Maximum: 437
Range: 436
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 11.52131977
Coefficient of variation (CV): 2.58689633
Kurtosis: 723.0871271
Mean: 4.453723034
Median Absolute Deviation (MAD): 0
Skewness: 21.42687354
Sum: 12800
Variance: 132.7408093
Monotonicity: Not monotonic
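
Several of the n_posts statistics are derived from others, so they can be cross-checked. The sketch below is a minimal check using the standard definitions (CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 - Q1, range = maximum - minimum, mean = sum / observations). The variable names are illustrative.

```python
# Values copied from the n_posts statistics.
mean = 4.453723034
std = 11.52131977
q1, q3 = 1, 4
minimum, maximum = 1, 437
total, n_observations = 12800, 2874

cv = std / mean                     # coefficient of variation
variance = std ** 2                 # variance from standard deviation
iqr = q3 - q1                       # interquartile range
value_range = maximum - minimum     # range
mean_from_sum = total / n_observations

print(f"CV: {cv:.6f}")
print(f"Variance: {variance:.4f}")
print(f"IQR: {iqr}, Range: {value_range}")
print(f"Mean from sum: {mean_from_sum:.6f}")
```

The recomputed values agree with the reported ones up to rounding of the displayed digits.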
Histogram with fixed size bins (bins=50)
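
Common Values
Value | Count | Frequency (%)
1 | 1481 | 51.5%
2 | 349 | 12.1%
3 | 197 | 6.9%
4 | 131 | 4.6%
5 | 116 | 4.0%
7 | 79 | 2.7%
6 | 72 | 2.5%
8 | 72 | 2.5%
9 | 52 | 1.8%
11 | 50 | 1.7%
Other values (50) | 275 | 9.6%

Minimum 10 values
Value | Count | Frequency (%)
1 | 1481 | 51.5%
2 | 349 | 12.1%
3 | 197 | 6.9%
4 | 131 | 4.6%
5 | 116 | 4.0%
6 | 72 | 2.5%
7 | 79 | 2.7%
8 | 72 | 2.5%
9 | 52 | 1.8%
10 | 44 | 1.5%

Maximum 10 values
Value | Count | Frequency (%)
437 | 1 | < 0.1%
190 | 1 | < 0.1%
129 | 1 | < 0.1%
78 | 1 | < 0.1%
74 | 2 | 0.1%
73 | 1 | < 0.1%
71 | 1 | < 0.1%
64 | 1 | < 0.1%
63 | 1 | < 0.1%
60 | 1 | < 0.1%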

tokens
Categorical

HIGH CARDINALITY

Distinct: 2454
Distinct (%): 85.4%
Missing: 0
Missing (%): 0.0%
Memory size: 3.4 MiB
['discussion', 'ci-dessous']: 130
[]: 62
['message', 'déposer', 'automatiquement', 'robot']: 52
['exception', 'faire', 'créateur', 'article', 'avis', 'utilisateur', 'récemment', 'inscrire', 'contribution', 'non', 'identifiable', 'IP', 'opinion', 'non', 'signer', 'principe', 'prendre', 'compte', 'être', 'cas', 'pouvoir', 'toutefois', 'participer', 'discussion', 'exprimer', 'ci-dessous', 'information']: 48
['exception', 'faire', 'créateur', 'article', 'avis', 'utilisateur', 'récemment', 'inscrire', 'contribution', 'non', 'identifiable', 'IP', 'principe', 'prendre', 'compte', 'être', 'cas', 'pouvoir', 'toutefois', 'participer', 'discussion', 'exprimer', 'ci-dessous', 'information']: 39
Other values (2449): 2543

Length

Max length: 53764
Median length: 300
Mean length: 867.4822547
Min length: 2

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0

Unique

Unique: 2418
Unique (%): 84.1%

Sample

1st row['rarement', 'article', 'fournir', 'bandeau', 'relever', 'article', 'respecte', 'doute', 'guère', 'grammaire', 'wikipédienn', 'lucky', 'Luke', 'bandeau', 'encombrer', 'guère', 'bon', 'usage', 'absence', 'débat', 'pdd', 'regrettable', 'y', 'page', 'fort', 'utile', 'bref', 'article', 'éminemment', 'perfectible', 'forme', 'diversité', 'sourçage', 'sauver', 'avis', 'Bonjour', 'poseur', 'bandeau', 'article', 'admissibilité', 'article', 'préciser', 'chose', 'considérer', 'lucky', 'Luke', 'bandeau', 'répondre', 'bandeau', 'poser', 'Copyvio', 'contributeur', 'remercier', 'rédaction', 'fournir', 'lien', 'accès', 'ressource', 'lien', 'accès', 'ressource', 'thèse', '559', 'page', 'y', 'copier', 'coller', 'section', 'ti', 'Complétons', 'fin', 'ri', 'Wikipédia', 'travail', 'inédit', 'opinion', 'excessivement', 'minoritaire', 'associer', 'source', 'juger', 'confidentiel', 'fiable', 'voire', 'simplement', 'interprétation', 'déduction', 'intuition', 'personnel', 'rédacteur', 'article', 'exemple', 'section', 'évolution', 'jour', 'facteur', 'jouer', 'faveur', 'vallée', 'fin', 'xix', 'siècle', 'retourner', 'inconvénient', 'disparaître', 'rente', 'énergétique', 'commander', 'couplage', 'usine', 'centrale', 'hydroélectrique', 'section', 'glorieux', 'conglomérat', 'omniprésent', 'vallée', 'falloir', 'oublier', 'envergure', 'non', 'national', 'exercer', 'stratégie', 'champ', 'mondial', 'doute', 'solidité', 'ancrage', 'nord-alpin', 'section', '1914', '1939', 'mention', 'spécial', 'faire', 'usine', 'Epierre', 'bas', 'Maurienne', 'four', 'électrique', 'origine', 'jusqu’', 'fermeture', 'consacrer', 'fabrication', 'dérivé', 'phosphore', 'agir', 'relocalisation', 'firme', 'Coignet', 'origine', 'lyonnais', 'terme', 'pérégrination', 'bandeau', 'sourçage', 'article', 'mérite', 'certainement', 'mieux', 'source', 'bien', 'qualité', 'unique', 'source', 'fête', '40', 'an', 'poser', 'problématique', 'mise', 'jour', 'également', 'partisan', 'sauver', 'article', 'contenu', 'favorable', 'fusion', 
'toilettage', 'profond', 'article', 'Cdt', 'vrai', 'accumulation', 'bandeau', 'interpelle', 'fond', 'article', 'souffrir', 'bel', 'bien', 'multiple', 'problème', 'source', 'wikifier', 'ti', 'mesure', 'appuyer', 'thèse', 'ancien', 'important', 'essayer', 'expliquer', 'patiemment', 'auteur', 'manifestement', 'spécialiste', 'usage', 'encyclopédie', 'monde', 'sorte', 'gagnant', 'bonjour', 'bref', 'commencer', 'renommer', 'article', 'houille', 'blanc', 'Maurienne', 'trouver', 'source', 'secondaire', 'solide', 'traiter', 'sujet', 'résoudre', 'problème', 'signaler', 'voir', 'y', 'lieu', 'non', 'conserver', 'article', 'moment', 'page', 'apparaître', 'non', 'ti', 'copyvio', 'simple', 'fiche', 'lecture', 'résumé', 'thèse', 'Louis', 'Chabert', 'exception', 'dernier', 'section', 'constitue', 'sujet', 'admissible', 'absence', 'source', 'secondaire', 'thèse', 'caractère', 'simple', 'résumé', 'non', 'admissible', 'source', 'secondaire', 'indépendant', 'évaluer', 'thèse', 'ailleurs', 'confirmer', 'https://fr.wikipedia.org/w/index.php?title=industrie_de_la_houille_blanchediff=143234225oldid=143234146', 'commentaire', 'création', 'article']
2nd row['déplacement', 'bandeau', 'ti', 'section', 'douteux', 'qualifierez', 'vous', 'trop', 'enjouer', 'promotionnel', 'passage', 'citer', 'haut', 'page', 'réponse', 'bonjour', 'bien', 'problème', 'source', 'unique', 'évite', 'difficilement', 'générer', 'manque', 'neutralité', 'ici', 'haut', 'affaire', 'page', 'promouvoir', 'thèse', 'Louis', 'Chabert', 'constitue', 'sorte', 'fiche', 'lecture', 'non', 'critique', 'faute', 'source', 'secondaire', 'analyser', 'évaluer', 'thèse', 'souligner', 'thèse', 'Louis', 'Chabert', 'largement', 'financer', 'Péchiney', 'Ugine', 'Kuhlmann', 'voir', 'lien', 'indiquer', 'haut', 'expliquer', 'tendance', 'article', 'favoriser', 'touche', 'activité', 'groupe', 'très', 'présenter', 'article', 'conclusion', 'manque', 'neutralité', 'doute', 'problème', 'essentiel', 'article', 'créer', 'Bonjour', 'oui', 'dsl', 'on', 'demander', 'suppr', 'article', 'y', 'amélioration', 'bout', 'temps', 'coopération', 'Louis', 'Chabert', 'lacune', 'voir', 'état', 'paf', 'aboutir', 'argument', 'd', 'rrr_utilisateur', 'chabert', 'louis_rrr', 'chabert', 'Louis', 'actif', 'intervention', 'effectivement', 'nécessaire', 'assurer', 'conservation', 'article', 'bonjour', 'falloir', 'temps', 'lancer', 'procédure', 'suppression', 'rien', 'venir', 'falloir', 'remplacer', 'bandeau', 'info', 'cf.', 'https://fr.wikipedia.org/wiki/wikip%c3%a9dia:requ%c3%aate_aux_administrateursti_manifeste', '_', '1', 'immédiatement', 'création', 'bandeau', 'admis', 'poser', 'chabert', 'Louis', 'venir', 'réagir', 'rrr_utilisateur', 'chabert', 'louis_rrr', 'demande', 'temps', 'devoir', 'donner', 'article', 'potentiel', 'plaisir', 'lire']
3rd row['discussion', 'transférer', 'Wikipédia', 'page', 'fusionner', 'Bonjour', 'propose', 'retirer', 'jour', 'bandeau', 'fusion', 'archiver', 'pdd', 'respectif', 'procédure', 'ad', 'hoc', 'oui', 'RRR_Utilisateur', 'ot38_rrr', 'OT38', 'fusion', 'finalement', 'procédure', 'approprié', 'discussion', 'mérite', 'conserver', 'falloir', 'statuer', 'nuée', 'bandeau', 'copivio', 'admissibilité', 'etc.', 'historique', 'auteur', 'article', 'également', 'auteur', 'thèse', 'probable', 'thèse', '559', 'page', 'résume', 'grâce', 'copier', 'coller', 'bref', 'admissibilité', 'discute', 'pdd', 'fusion', 'contenu', 'évidence', 'gêne', 'beaucoup', 'article', 'industrie', 'houille', 'blanc', 'état', 'actuel', 'devoir', 'titrer', 'houille', 'blanc', 'Maurienne', 'coup', 'voir', 'bien', 'fusionner', 'article', 'appuyer', 'ailleurs', 'unique', 'source', 'contraire', 'demande', 'Wikipédia', 'risque', 'déséquilibrer', 'totalement', 'article', 'houille', 'blanc', 'vocation', 'traiter', 'industrie', 'houille', 'blanc', 'totalité', 'monde', 'Maurienne', 'France', 'Union', 'européen', 'monde', 'entier', 'moment', 'guère', 'ébauche', 'article', 'industrie', 'houille', 'blanc', 'Maurienne', 'commencer', 'démontrer', 'admissibilité', 'éventuel', 'renommage', 'résolution', 'problème', 'vouloir', 'insister', 'lancer', 'procédure', 'suppression', 'mieux', 'clôturer', 'procédure', 'lancer', 'réagir']
4th row['ajouter', 'référence', 'bibliographique', 'choix', 'arbitraire', 'difficulté', 'préférer', 'reporter', 'place', 'référence', 'thèse', 'résulte', 'référencer', 'fois', 'devenue', 'inutile', 'papeterie', 'charge', 'supprimer', 'dernier', 'part', '23', 'souvenir', 'bien', 'proposer', 'face', 'texte', 'gauche', 'écran', 'traitement', 'cosmétique', 'propre', 'terme', 'devoir', 'je', 'soumettre', 'texte', 'traitement', 'affaire', 'suppose', 'avoir', 'revoir', 'ensemble', 'texte', 'esprit', 'fois', 'opération', 'terminer', 'rester', '-t', 'il', 'faire', 'mettre', 'article', 'conformité', 'code', 'wikipedia', 'souhaite', 'bien', 'sûr', 'voir', 'bout', 'aide', 'zzznote', 'type', 'unsigned_non', 'signé|chabert', 'Louis|27', 'janvier', '2018', '17:02', 'CET)|144917352|notif=']
5th row['enrichir', 'article', 'commune', 'Maurienne', 'orelle', 'saint-etienne-de-cuine', 'créer', 'nouveau', 'source', 'article', 'houille', 'blanche--']

Common Values

Value | Count | Frequency (%)
['discussion', 'ci-dessous'] | 130 | 4.5%
[] | 62 | 2.2%
['message', 'déposer', 'automatiquement', 'robot'] | 52 | 1.8%
['exception', 'faire', 'créateur', 'article', 'avis', 'utilisateur', 'récemment', 'inscrire', 'contribution', 'non', 'identifiable', 'IP', 'opinion', 'non', 'signer', 'principe', 'prendre', 'compte', 'être', 'cas', 'pouvoir', 'toutefois', 'participer', 'discussion', 'exprimer', 'ci-dessous', 'information'] | 48 | 1.7%
['exception', 'faire', 'créateur', 'article', 'avis', 'utilisateur', 'récemment', 'inscrire', 'contribution', 'non', 'identifiable', 'IP', 'principe', 'prendre', 'compte', 'être', 'cas', 'pouvoir', 'toutefois', 'participer', 'discussion', 'exprimer', 'ci-dessous', 'information'] | 39 | 1.4%
['exception', 'faire', 'créateur', 'article', 'avis', 'utilisateur', 'inscrire', 'contribution', 'non', 'identifiable', 'IP', 'principe', 'prendre', 'compte', 'être', 'cas', 'pouvoir', 'toutefois', 'participer', 'discussion', 'exprimer', 'ci-dessous', 'information'] | 30 | 1.0%
['exception', 'faire', 'créateur', 'article', 'avis', 'utilisateur', 'récemment', 'inscrire', 'contribution', 'non', 'identifiable', 'IPs', 'opinion', 'non', 'signer', 'principe', 'décompter', 'être', 'cas', 'pouvoir', 'toutefois', 'participer', 'discussion', 'exprimer', 'ci-dessous', 'information'] | 13 | 0.5%
['message', 'déposer'] | 10 | 0.3%
['_', '_', 'noinde', '_', '_', 'instruction'] | 8 | 0.3%
['terme', 'tour'] | 5 | 0.2%
Other values (2444) | 2477 | 86.2%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
article | 5217 | 2.4%
(blank) | 2651 | 1.2%
source | 2073 | 0.9%
y | 1584 | 0.7%
non | 1577 | 0.7%
faire | 1575 | 0.7%
bien | 1509 | 0.7%
avis | 1333 | 0.6%
page | 1283 | 0.6%
discussion | 1235 | 0.6%
Other values (19738) | 198998 | 90.9%

interlocutors
Categorical

HIGH CARDINALITY

Distinct: 1766
Distinct (%): 61.4%
Missing: 0
Missing (%): 0.0%
Memory size: 355.0 KiB
['anonyme']: 878
['bot']: 86
['anonyme', 'anonyme']: 32
['ℳ𝒄𝓛𝒖𝒔𝒉FR']: 7
['Azurfrog']: 7
Other values (1761): 1864

Length

Max length: 4807
Median length: 18
Mean length: 55.2466945
Min length: 7

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0

Unique

Unique: 1694
Unique (%): 58.9%

Sample

1st row: ['Borvan53', 'anonyme', 'anonyme', 'OT38', 'Binabik', 'Azurfrog']
2nd row: ['OT38', 'OT38', 'Azurfrog', 'OT38', 'Azurfrog', 'Borvan53', 'Azurfrog', 'OT38', 'Borvan53']
3rd row: ['anonyme', 'OT38', 'Borvan53', 'Borvan53', 'Azurfrog', 'Azurfrog', 'OT38', 'Nouill']
4th row: ['bot']
5th row: ['CHABERT Louis']
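
The length statistics for this variable (minimum length 7, which matches the text `['bot']`) suggest that each interlocutors cell is stored as the text of a Python list rather than as a real list. Under that assumption, a cell can be parsed back and summarized as in the sketch below. The helper name `summarize_interlocutors` and the returned keys are hypothetical. The report does not state how n_anonymes, n_bots, or the per-interlocutor means were actually computed.

```python
import ast
from collections import Counter

def summarize_interlocutors(cell):
    """Parse one interlocutors cell (text of a Python list) and count entries."""
    names = ast.literal_eval(cell)  # safe parsing of a list literal
    counts = Counter(names)
    named = set(names) - {"anonyme", "bot"}
    return {
        "n_entries": len(names),             # one entry per post author
        "n_anonymous": counts["anonyme"],    # entries labelled 'anonyme'
        "n_bots": counts["bot"],             # entries labelled 'bot'
        "n_distinct_named": len(named),      # distinct non-anonymous, non-bot names
    }

# First sample row of the interlocutors variable.
first_row = "['Borvan53', 'anonyme', 'anonyme', 'OT38', 'Binabik', 'Azurfrog']"
print(summarize_interlocutors(first_row))
```

`ast.literal_eval` is used instead of `eval` because it only accepts literals, which is safer for text coming from a data file.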

Common Values

Value | Count | Frequency (%)
['anonyme'] | 878 | 30.5%
['bot'] | 86 | 3.0%
['anonyme', 'anonyme'] | 32 | 1.1%
['ℳ𝒄𝓛𝒖𝒔𝒉FR'] | 7 | 0.2%
['Azurfrog'] | 7 | 0.2%
['Gemini1980'] | 6 | 0.2%
['Patrick Rogel'] | 6 | 0.2%
['Valérie75'] | 5 | 0.2%
['anonyme', 'anonyme', 'anonyme'] | 5 | 0.2%
['Christophe95'] | 4 | 0.1%
Other values (1756) | 1838 | 64.0%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
anonyme | 3706 | 25.1%
liege | 126 | 0.9%
chris | 126 | 0.9%
a | 126 | 0.9%
schlum | 116 | 0.8%
elnon | 114 | 0.8%
bot | 104 | 0.7%
patrick | 94 | 0.6%
rogel | 94 | 0.6%
unsigned | 93 | 0.6%
Other values (1922) | 10041 | 68.1%

n_interlocutors
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
SKEWED

Distinct: 60
Distinct (%): 2.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.453723034
Minimum: 1
Maximum: 437
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 22.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 4
95-th percentile: 15
Maximum: 437
Range: 436
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 11.52131977
Coefficient of variation (CV): 2.58689633
Kurtosis: 723.0871271
Mean: 4.453723034
Median Absolute Deviation (MAD): 0
Skewness: 21.42687354
Sum: 12800
Variance: 132.7408093
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
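
Common Values
Value | Count | Frequency (%)
1 | 1481 | 51.5%
2 | 349 | 12.1%
3 | 197 | 6.9%
4 | 131 | 4.6%
5 | 116 | 4.0%
7 | 79 | 2.7%
6 | 72 | 2.5%
8 | 72 | 2.5%
9 | 52 | 1.8%
11 | 50 | 1.7%
Other values (50) | 275 | 9.6%

Minimum 10 values
Value | Count | Frequency (%)
1 | 1481 | 51.5%
2 | 349 | 12.1%
3 | 197 | 6.9%
4 | 131 | 4.6%
5 | 116 | 4.0%
6 | 72 | 2.5%
7 | 79 | 2.7%
8 | 72 | 2.5%
9 | 52 | 1.8%
10 | 44 | 1.5%

Maximum 10 values
Value | Count | Frequency (%)
437 | 1 | < 0.1%
190 | 1 | < 0.1%
129 | 1 | < 0.1%
78 | 1 | < 0.1%
74 | 2 | 0.1%
73 | 1 | < 0.1%
71 | 1 | < 0.1%
64 | 1 | < 0.1%
63 | 1 | < 0.1%
60 | 1 | < 0.1%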

dates
Categorical

HIGH CARDINALITY

Distinct: 2170
Distinct (%): 75.5%
Missing: 0
Missing (%): 0.0%
Memory size: 365.5 KiB
[None]: 674
[None, None]: 16
[None, None, None]: 6
['2017-05-18T02:36']: 3
[None, '2011-03-13T17:02', '2011-03-13T17:02', None, '2011-03-13T15:14', '2011-03-18T22:00', '2011-03-13T12:29', '2011-03-13T15:14', '2011-03-18T22:00']: 2
Other values (2165): 2173

Length

Max length: 2622
Median length: 20
Mean length: 73.17466945
Min length: 6

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0

Unique

Unique: 2157
Unique (%): 75.1%

Sample

1st row: ['2017-12-06T15:56', None, None, '2017-12-06T16:48', '2017-12-07T20:57', '2017-12-12T09:18']
2nd row: ['2017-12-12T09:52', None, '2017-12-12T10:04', '2017-12-12T10:15', None, '2017-12-12T12:15', '2017-12-12T12:37', '2017-12-12T12:52', '2017-12-12T23:21']
3rd row: [None, '2017-12-18T14:45', '2017-12-19T10:30', '2017-12-06T15:44', '2017-12-12T09:30', '2017-12-12T08:57', '2017-12-12T09:03', '2017-12-23T21:26']
4th row: ['2018-01-27T17:02']
5th row: ['2018-02-02T22:11']
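
Each dates cell appears to be the text of a Python list whose items are either `None` or a timestamp in the form `YYYY-MM-DDTHH:MM` (the minimum length of 6 matches the text `[None]`). Assuming that layout, the sketch below parses a cell and measures the time between the earliest and latest known timestamps. The helper names `parse_dates` and `thread_span` are hypothetical and are not part of the report.

```python
import ast
from datetime import datetime

def parse_dates(cell):
    """Parse one dates cell into a list of datetime objects (None is kept)."""
    raw = ast.literal_eval(cell)  # safe parsing of a list literal
    return [
        datetime.strptime(item, "%Y-%m-%dT%H:%M") if item is not None else None
        for item in raw
    ]

def thread_span(cell):
    """Time between the earliest and latest known timestamps, or None."""
    known = [d for d in parse_dates(cell) if d is not None]
    if not known:
        return None  # every post in the thread is undated
    return max(known) - min(known)

# First sample row of the dates variable.
first_row = (
    "['2017-12-06T15:56', None, None, '2017-12-06T16:48', "
    "'2017-12-07T20:57', '2017-12-12T09:18']"
)
print(thread_span(first_row))
print(thread_span("[None]"))
```

Undated posts are skipped rather than guessed, which matters here because `[None]` alone accounts for 674 of the 2874 rows.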

Common Values

Value | Count | Frequency (%)
[None] | 674 | 23.5%
[None, None] | 16 | 0.6%
[None, None, None] | 6 | 0.2%
['2017-05-18T02:36'] | 3 | 0.1%
[None, '2011-03-13T17:02', '2011-03-13T17:02', None, '2011-03-13T15:14', '2011-03-18T22:00', '2011-03-13T12:29', '2011-03-13T15:14', '2011-03-18T22:00'] | 2 | 0.1%
['2018-09-08T00:31'] | 2 | 0.1%
['2010-08-06T13:13'] | 2 | 0.1%
['2008-04-13T10:34'] | 2 | 0.1%
['2017-09-05T14:43'] | 2 | 0.1%
['2008-06-26T03:11', '2008-06-26T04:23', '2008-06-26T08:14'] | 2 | 0.1%
Other values (2160) | 2163 | 75.3%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
none | 3264 | 25.5%
2010-01-25t00:46 | 27 | 0.2%
2012-10-21t18:10 | 27 | 0.2%
2016-12-20t00:41 | 15 | 0.1%
2007-11-23t00:14 | 9 | 0.1%
2016-12-20t01:27 | 9 | 0.1%
2014-11-18t22:08 | 9 | 0.1%
2011-07-01t10:33 | 9 | 0.1%
2012-10-24t02:26 | 7 | 0.1%
2006-03-20t23:50 | 6 | < 0.1%
Other values (5450) | 9418 | 73.6%

n_anonymes
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
SKEWED
ZEROS

Distinct: 27
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.289491997
Minimum: 0
Maximum: 437
Zeros: 1166
Zeros (%): 40.6%
Negative: 0
Negative (%): 0.0%
Memory size: 22.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 1
Q3: 1
95-th percentile: 3
Maximum: 437
Range: 437
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 9.159778728
Coefficient of variation (CV): 7.103400989
Kurtosis: 1845.181063
Mean: 1.289491997
Median Absolute Deviation (MAD): 1
Skewness: 40.68570103
Sum: 3706
Variance: 83.90154635
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=27)
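
Common Values
Value | Count | Frequency (%)
1 | 1196 | 41.6%
0 | 1166 | 40.6%
2 | 233 | 8.1%
3 | 147 | 5.1%
4 | 60 | 2.1%
5 | 22 | 0.8%
6 | 10 | 0.3%
7 | 9 | 0.3%
8 | 7 | 0.2%
10 | 6 | 0.2%
Other values (17) | 18 | 0.6%

Minimum 10 values
Value | Count | Frequency (%)
0 | 1166 | 40.6%
1 | 1196 | 41.6%
2 | 233 | 8.1%
3 | 147 | 5.1%
4 | 60 | 2.1%
5 | 22 | 0.8%
6 | 10 | 0.3%
7 | 9 | 0.3%
8 | 7 | 0.2%
9 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
437 | 1 | < 0.1%
189 | 1 | < 0.1%
52 | 1 | < 0.1%
49 | 1 | < 0.1%
46 | 1 | < 0.1%
31 | 1 | < 0.1%
27 | 1 | < 0.1%
26 | 1 | < 0.1%
25 | 1 | < 0.1%
24 | 1 | < 0.1%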

n_bots
Categorical

Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 162.9 KiB
0: 2775
1: 94
2: 5

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 1
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 2775 | 96.6%
1 | 94 | 3.3%
2 | 5 | 0.2%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
0 | 2775 | 96.6%
1 | 94 | 3.3%
2 | 5 | 0.2%

mean_post_per_interlocutor
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 116
Distinct (%): 4.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.8543294176
Minimum: 0
Maximum: 10.25
Zeros: 1007
Zeros (%): 35.0%
Negative: 0
Negative (%): 0.0%
Memory size: 22.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 1
Q3: 1
95-th percentile: 2.142857143
Maximum: 10.25
Range: 10.25
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.8824121992
Coefficient of variation (CV): 1.03287114
Kurtosis: 14.72205936
Mean: 0.8543294176
Median Absolute Deviation (MAD): 0.3333333333
Skewness: 2.51616118
Sum: 2455.342746
Variance: 0.7786512892
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values
Value | Count | Frequency (%)
1 | 1278 | 44.5%
0 | 1007 | 35.0%
2 | 103 | 3.6%
1.5 | 88 | 3.1%
1.333333333 | 40 | 1.4%
1.25 | 30 | 1.0%
3 | 23 | 0.8%
1.4 | 16 | 0.6%
2.5 | 16 | 0.6%
1.666666667 | 15 | 0.5%
Other values (106) | 258 | 9.0%

Minimum 10 values
Value | Count | Frequency (%)
0 | 1007 | 35.0%
1 | 1278 | 44.5%
1.029411765 | 1 | < 0.1%
1.052631579 | 2 | 0.1%
1.0625 | 1 | < 0.1%
1.066666667 | 1 | < 0.1%
1.071428571 | 2 | 0.1%
1.076923077 | 1 | < 0.1%
1.083333333 | 2 | 0.1%
1.090909091 | 7 | 0.2%

Maximum 10 values
Value | Count | Frequency (%)
10.25 | 1 | < 0.1%
9.5 | 1 | < 0.1%
7.333333333 | 1 | < 0.1%
7 | 2 | 0.1%
6.8 | 1 | < 0.1%
6.5 | 1 | < 0.1%
6.2 | 1 | < 0.1%
6 | 1 | < 0.1%
5.8 | 1 | < 0.1%
5.666666667 | 1 | < 0.1%

mean_post_per_interlocutor_with_anonymous
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
SKEWED
ZEROS

Distinct: 136
Distinct (%): 4.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.479514499
Minimum: 0
Maximum: 437
Zeros: 86
Zeros (%): 3.0%
Negative: 0
Negative (%): 0.0%
Memory size: 22.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 1
Median: 1
Q3: 1.25
95-th percentile: 2.463888889
Maximum: 437
Range: 437
Interquartile range (IQR): 0.25

Descriptive statistics

Standard deviation: 8.459169808
Coefficient of variation (CV): 5.71753086
Kurtosis: 2453.668134
Mean: 1.479514499
Median Absolute Deviation (MAD): 0
Skewness: 48.14186938
Sum: 4252.124671
Variance: 71.55755384
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values
Value | Count | Frequency (%)
1 | 1940 | 67.5%
2 | 127 | 4.4%
1.5 | 119 | 4.1%
0 | 86 | 3.0%
1.25 | 55 | 1.9%
1.333333333 | 54 | 1.9%
1.666666667 | 38 | 1.3%
3 | 27 | 0.9%
1.6 | 27 | 0.9%
1.4 | 26 | 0.9%
Other values (126) | 375 | 13.0%

Minimum 10 values
Value | Count | Frequency (%)
0 | 86 | 3.0%
1 | 1940 | 67.5%
1.038461538 | 1 | < 0.1%
1.0625 | 2 | 0.1%
1.071428571 | 1 | < 0.1%
1.083333333 | 4 | 0.1%
1.090909091 | 4 | 0.1%
1.1 | 12 | 0.4%
1.111111111 | 3 | 0.1%
1.125 | 6 | 0.2%

Maximum 10 values
Value | Count | Frequency (%)
437 | 1 | < 0.1%
95 | 1 | < 0.1%
52 | 1 | < 0.1%
46 | 1 | < 0.1%
24 | 1 | < 0.1%
14.6 | 1 | < 0.1%
14 | 1 | < 0.1%
10.66666667 | 1 | < 0.1%
10 | 1 | < 0.1%
9.8 | 1 | < 0.1%

max_post_per_interlocutor_with_anonymous
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
SKEWED

Distinct: 30
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.049756437
Minimum: 1
Maximum: 437
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 22.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 2
95-th percentile: 5
Maximum: 437
Range: 436
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 9.222505442
Coefficient of variation (CV): 4.499317712
Kurtosis: 1782.396093
Mean: 2.049756437
Median Absolute Deviation (MAD): 0
Skewness: 39.66133588
Sum: 5891
Variance: 85.05460662
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=30)
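
Common Values
Value | Count | Frequency (%)
1 | 2023 | 70.4%
2 | 385 | 13.4%
3 | 209 | 7.3%
4 | 106 | 3.7%
5 | 40 | 1.4%
6 | 28 | 1.0%
7 | 18 | 0.6%
8 | 13 | 0.5%
10 | 10 | 0.3%
9 | 7 | 0.2%
Other values (20) | 35 | 1.2%

Minimum 10 values
Value | Count | Frequency (%)
1 | 2023 | 70.4%
2 | 385 | 13.4%
3 | 209 | 7.3%
4 | 106 | 3.7%
5 | 40 | 1.4%
6 | 28 | 1.0%
7 | 18 | 0.6%
8 | 13 | 0.5%
9 | 7 | 0.2%
10 | 10 | 0.3%

Maximum 10 values
Value | Count | Frequency (%)
437 | 1 | < 0.1%
189 | 1 | < 0.1%
52 | 1 | < 0.1%
49 | 1 | < 0.1%
46 | 1 | < 0.1%
38 | 1 | < 0.1%
31 | 1 | < 0.1%
27 | 1 | < 0.1%
26 | 1 | < 0.1%
25 | 1 | < 0.1%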

n_tokens
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 367
Distinct (%): 12.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 76.19102296
Minimum: 0
Maximum: 5604
Zeros: 62
Zeros (%): 2.2%
Negative: 0
Negative (%): 0.0%
Memory size: 22.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2
Q1: 7
median: 26
Q3: 70
95-th percentile: 290.35
Maximum: 5604
Range: 5604
Interquartile range (IQR): 63

Descriptive statistics

Standard deviation: 201.8889962
Coefficient of variation (CV): 2.649774059
Kurtosis: 258.0328356
Mean: 76.19102296
Median Absolute Deviation (MAD): 21
Skewness: 12.5526167
Sum: 218973
Variance: 40759.16677
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
2                   178    6.2%
7                   151    5.3%
6                   118    4.1%
4                   91     3.2%
27                  74     2.6%
24                  72     2.5%
0                   62     2.2%
5                   60     2.1%
8                   55     1.9%
10                  53     1.8%
Other values (357)  1960   68.2%
Value               Count  Frequency (%)
0                   62     2.2%
1                   21     0.7%
2                   178    6.2%
3                   47     1.6%
4                   91     3.2%
5                   60     2.1%
6                   118    4.1%
7                   151    5.3%
8                   55     1.9%
9                   49     1.7%
Value               Count  Frequency (%)
5604                1      < 0.1%
3376                1      < 0.1%
3354                1      < 0.1%
2108                1      < 0.1%
2006                1      < 0.1%
1851                1      < 0.1%
1667                1      < 0.1%
1513                1      < 0.1%
1337                1      < 0.1%
1252                1      < 0.1%

mean_tokens
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 739
Distinct (%): 25.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 18.00225457
Minimum: 0
Maximum: 282
Zeros: 62
Zeros (%): 2.2%
Negative: 0
Negative (%): 0.0%
Memory size: 22.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2
Q1: 6
median: 12
Q3: 23.075
95-th percentile: 51
Maximum: 282
Range: 282
Interquartile range (IQR): 17.075

Descriptive statistics

Standard deviation: 21.84035879
Coefficient of variation (CV): 1.213201308
Kurtosis: 33.9727512
Mean: 18.00225457
Median Absolute Deviation (MAD): 7.5
Skewness: 4.55091528
Sum: 51738.47963
Variance: 477.0012721
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
2                   184    6.4%
7                   164    5.7%
6                   121    4.2%
4                   103    3.6%
5                   71     2.5%
0                   62     2.2%
27                  60     2.1%
24                  59     2.1%
8                   56     1.9%
9                   56     1.9%
Other values (729)  1938   67.4%
Value               Count  Frequency (%)
0                   62     2.2%
0.1666666667        1      < 0.1%
0.1818181818        1      < 0.1%
0.1842105263        1      < 0.1%
0.5                 1      < 0.1%
0.6666666667        1      < 0.1%
0.7142857143        1      < 0.1%
1                   20     0.7%
1.2                 1      < 0.1%
1.5                 3      0.1%
Value               Count  Frequency (%)
282                 1      < 0.1%
277                 1      < 0.1%
245                 1      < 0.1%
224                 1      < 0.1%
223                 1      < 0.1%
193                 1      < 0.1%
192                 1      < 0.1%
181                 1      < 0.1%
176                 2      0.1%
163                 1      < 0.1%

min_tokens
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 107
Distinct (%): 3.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.77139875
Minimum: 0
Maximum: 282
Zeros: 358
Zeros (%): 12.5%
Negative: 0
Negative (%): 0.0%
Memory size: 22.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 2
median: 5
Q3: 14
95-th percentile: 40
Maximum: 282
Range: 282
Interquartile range (IQR): 12

Descriptive statistics

Standard deviation: 19.99635199
Coefficient of variation (CV): 1.698723526
Kurtosis: 39.29017005
Mean: 11.77139875
Median Absolute Deviation (MAD): 4
Skewness: 4.995187932
Sum: 33831
Variance: 399.8540929
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
0                   358    12.5%
1                   338    11.8%
2                   338    11.8%
7                   199    6.9%
6                   158    5.5%
4                   151    5.3%
3                   147    5.1%
5                   131    4.6%
9                   74     2.6%
8                   73     2.5%
Other values (97)   907    31.6%
Value               Count  Frequency (%)
0                   358    12.5%
1                   338    11.8%
2                   338    11.8%
3                   147    5.1%
4                   151    5.3%
5                   131    4.6%
6                   158    5.5%
7                   199    6.9%
8                   73     2.5%
9                   74     2.6%
Value               Count  Frequency (%)
282                 1      < 0.1%
245                 1      < 0.1%
224                 1      < 0.1%
193                 1      < 0.1%
192                 1      < 0.1%
181                 1      < 0.1%
176                 2      0.1%
163                 1      < 0.1%
150                 1      < 0.1%
145                 1      < 0.1%

max_tokens
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 165
Distinct (%): 5.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 30.01704941
Minimum: 0
Maximum: 436
Zeros: 62
Zeros (%): 2.2%
Negative: 0
Negative (%): 0.0%
Memory size: 22.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2
Q1: 7
median: 21
Q3: 35
95-th percentile: 95
Maximum: 436
Range: 436
Interquartile range (IQR): 28

Descriptive statistics

Standard deviation: 37.98593803
Coefficient of variation (CV): 1.265478745
Kurtosis: 21.56568885
Mean: 30.01704941
Median Absolute Deviation (MAD): 14
Skewness: 3.71489921
Sum: 86269
Variance: 1442.931488
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
7                   183    6.4%
2                   179    6.2%
27                  148    5.1%
6                   140    4.9%
24                  121    4.2%
4                   101    3.5%
23                  78     2.7%
5                   74     2.6%
18                  66     2.3%
9                   65     2.3%
Other values (155)  1719   59.8%
Value               Count  Frequency (%)
0                   62     2.2%
1                   23     0.8%
2                   179    6.2%
3                   53     1.8%
4                   101    3.5%
5                   74     2.6%
6                   140    4.9%
7                   183    6.4%
8                   58     2.0%
9                   65     2.3%
Value               Count  Frequency (%)
436                 1      < 0.1%
403                 1      < 0.1%
372                 1      < 0.1%
338                 1      < 0.1%
300                 4      0.1%
294                 1      < 0.1%
292                 1      < 0.1%
289                 1      < 0.1%
282                 1      < 0.1%
264                 1      < 0.1%

n_tokens_stopwords
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 576
Distinct (%): 20.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 167.6002088
Minimum: 0
Maximum: 8121
Zeros: 59
Zeros (%): 2.1%
Negative: 0
Negative (%): 0.0%
Memory size: 22.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 4
Q1: 12
median: 53
Q3: 149
95-th percentile: 660.1
Maximum: 8121
Range: 8121
Interquartile range (IQR): 137

Descriptive statistics

Standard deviation: 422.6445119
Coefficient of variation (CV): 2.521742157
Kurtosis: 117.593642
Mean: 167.6002088
Median Absolute Deviation (MAD): 46
Skewness: 8.828213253
Sum: 481683
Variance: 178628.3834
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
5                   184    6.4%
7                   182    6.3%
6                   92     3.2%
50                  63     2.2%
0                   59     2.1%
52                  45     1.6%
49                  44     1.5%
3                   34     1.2%
8                   30     1.0%
66                  26     0.9%
Other values (566)  2115   73.6%
Value               Count  Frequency (%)
0                   59     2.1%
1                   16     0.6%
2                   24     0.8%
3                   34     1.2%
4                   17     0.6%
5                   184    6.4%
6                   92     3.2%
7                   182    6.3%
8                   30     1.0%
9                   21     0.7%
Value               Count  Frequency (%)
8121                1      < 0.1%
7494                1      < 0.1%
6613                1      < 0.1%
5228                1      < 0.1%
4694                1      < 0.1%
4506                1      < 0.1%
3615                1      < 0.1%
3274                1      < 0.1%
3105                1      < 0.1%
3012                1      < 0.1%

mean_tokens_stopwords
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 941
Distinct (%): 32.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 38.0787808
Minimum: 0
Maximum: 651.5
Zeros: 59
Zeros (%): 2.1%
Negative: 0
Negative (%): 0.0%
Memory size: 22.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 3.144444444
Q1: 8
median: 24.95454545
Q3: 50
95-th percentile: 114.675
Maximum: 651.5
Range: 651.5
Interquartile range (IQR): 42

Descriptive statistics

Standard deviation: 47.87027887
Coefficient of variation (CV): 1.257137909
Kurtosis: 32.51514078
Mean: 38.0787808
Median Absolute Deviation (MAD): 17.95454545
Skewness: 4.39676692
Sum: 109438.416
Variance: 2291.563599
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
7                   187    6.5%
5                   183    6.4%
6                   98     3.4%
50                  61     2.1%
0                   59     2.1%
52                  42     1.5%
49                  38     1.3%
8                   36     1.3%
3                   36     1.3%
13                  32     1.1%
Other values (931)  2102   73.1%
Value               Count  Frequency (%)
0                   59     2.1%
0.1666666667        1      < 0.1%
0.1818181818        1      < 0.1%
0.4105263158        1      < 0.1%
0.5                 1      < 0.1%
0.6666666667        1      < 0.1%
0.7142857143        1      < 0.1%
1                   14     0.5%
2                   23     0.8%
2.5                 4      0.1%
Value               Count  Frequency (%)
651.5               1      < 0.1%
555.5               1      < 0.1%
544                 1      < 0.1%
468                 1      < 0.1%
448                 1      < 0.1%
446                 1      < 0.1%
424                 1      < 0.1%
381                 1      < 0.1%
365                 2      0.1%
343.5               1      < 0.1%

Interactions

Correlations

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
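
The calculation described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the code that produced this report: the function names `average_ranks`, `pearson_r` and `spearman_rho` are hypothetical, and tied values are assumed to receive the average of the ranks they span.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks (assumption)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Covariance divided by the product of standard deviations (the 1/n factors cancel)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r applied to the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))
```

For example, `spearman_rho([1, 2, 3, 4, 5], [1, 4, 9, 16, 25])` is 1 up to floating-point rounding, because the relationship is monotonic even though it is not linear.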

Pearson's r

Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear function the steepness of the line (its angle to the x-axis) does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
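
As a hedged sketch of the formula above, the following plain-Python function computes r directly from its definition. The name `pearson_r` is hypothetical and is not taken from the tool that generated this report; the sketch assumes two equal-length sequences with non-zero variance.

```python
import math


def pearson_r(x, y):
    """Pearson's r: covariance of x and y over the product of their standard deviations."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # The 1/n factors of covariance and of both variances cancel, so sums suffice.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)
```

Rescaling or shifting either input with a positive factor leaves the result unchanged, which is the invariance property mentioned above: `pearson_r(x, y)` and `pearson_r(x, [3 * v + 5 for v in y])` agree up to rounding.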

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
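
The pair-counting definition above can be sketched as follows. This is a minimal illustration of the simple variant often called tau-a; the function name `kendall_tau_a` is hypothetical, and the assumption that tied pairs count as neither concordant nor discordant is mine. A report generator may use a tie-corrected variant instead.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    concordant = 0
    discordant = 0
    for i, j in combinations(range(n), 2):
        # A pair is concordant when both variables order it the same way.
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
        # sign == 0 means a tie in x or y; it is counted as neither (assumption).
    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs
```

The double loop over pairs is quadratic in the number of observations, which is fine for a sketch; faster algorithms exist for larger inputs.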

Phik (φk)

Phik (φk) is a practical correlation coefficient that works consistently across categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient when the input follows a bivariate normal distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The dendrogram lets you correlate variable completion more fully, revealing trends deeper than the pairwise ones visible in the correlation heatmap.

Sample

First rows

title id_thread n_posts tokens interlocutors n_interlocutors dates n_anonymes n_bots mean_post_per_interlocutor mean_post_per_interlocutor_with_anonymous max_post_per_interlocutor_with_anonymous n_tokens mean_tokens min_tokens max_tokens n_tokens_stopwords mean_tokens_stopwords
0Bandeaux à foison11324890_16['rarement', 'article', 'fournir', 'bandeau', 'relever', 'article', 'respecte', 'doute', 'guère', 'grammaire', 'wikipédienn', 'lucky', 'Luke', 'bandeau', 'encombrer', 'guère', 'bon', 'usage', 'absence', 'débat', 'pdd', 'regrettable', 'y', 'page', 'fort', 'utile', 'bref', 'article', 'éminemment', 'perfectible', 'forme', 'diversité', 'sourçage', 'sauver', 'avis', 'Bonjour', 'poseur', 'bandeau', 'article', 'admissibilité', 'article', 'préciser', 'chose', 'considérer', 'lucky', 'Luke', 'bandeau', 'répondre', 'bandeau', 'poser', 'Copyvio', 'contributeur', 'remercier', 'rédaction', 'fournir', 'lien', 'accès', 'ressource', 'lien', 'accès', 'ressource', 'thèse', '559', 'page', 'y', 'copier', 'coller', 'section', 'ti', 'Complétons', 'fin', 'ri', 'Wikipédia', 'travail', 'inédit', 'opinion', 'excessivement', 'minoritaire', 'associer', 'source', 'juger', 'confidentiel', 'fiable', 'voire', 'simplement', 'interprétation', 'déduction', 'intuition', 'personnel', 'rédacteur', 'article', 'exemple', 'section', 'évolution', 'jour', 'facteur', 'jouer', 'faveur', 'vallée', 'fin', 'xix', 'siècle', 'retourner', 'inconvénient', 'disparaître', 'rente', 'énergétique', 'commander', 'couplage', 'usine', 'centrale', 'hydroélectrique', 'section', 'glorieux', 'conglomérat', 'omniprésent', 'vallée', 'falloir', 'oublier', 'envergure', 'non', 'national', 'exercer', 'stratégie', 'champ', 'mondial', 'doute', 'solidité', 'ancrage', 'nord-alpin', 'section', '1914', '1939', 'mention', 'spécial', 'faire', 'usine', 'Epierre', 'bas', 'Maurienne', 'four', 'électrique', 'origine', 'jusqu’', 'fermeture', 'consacrer', 'fabrication', 'dérivé', 'phosphore', 'agir', 'relocalisation', 'firme', 'Coignet', 'origine', 'lyonnais', 'terme', 'pérégrination', 'bandeau', 'sourçage', 'article', 'mérite', 'certainement', 'mieux', 'source', 'bien', 'qualité', 'unique', 'source', 'fête', '40', 'an', 'poser', 'problématique', 'mise', 'jour', 'également', 'partisan', 'sauver', 'article', 'contenu', 
'favorable', 'fusion', 'toilettage', 'profond', 'article', 'Cdt', 'vrai', 'accumulation', 'bandeau', 'interpelle', 'fond', 'article', 'souffrir', 'bel', 'bien', 'multiple', 'problème', 'source', 'wikifier', 'ti', 'mesure', 'appuyer', 'thèse', 'ancien', 'important', 'essayer', 'expliquer', 'patiemment', 'auteur', 'manifestement', 'spécialiste', 'usage', 'encyclopédie', 'monde', 'sorte', 'gagnant', 'bonjour', 'bref', 'commencer', 'renommer', 'article', 'houille', 'blanc', 'Maurienne', 'trouver', 'source', 'secondaire', 'solide', 'traiter', 'sujet', 'résoudre', 'problème', 'signaler', 'voir', 'y', 'lieu', 'non', 'conserver', 'article', 'moment', 'page', 'apparaître', 'non', 'ti', 'copyvio', 'simple', 'fiche', 'lecture', 'résumé', 'thèse', 'Louis', 'Chabert', 'exception', 'dernier', 'section', 'constitue', 'sujet', 'admissible', 'absence', 'source', 'secondaire', 'thèse', 'caractère', 'simple', 'résumé', 'non', 'admissible', 'source', 'secondaire', 'indépendant', 'évaluer', 'thèse', 'ailleurs', 'confirmer', 'https://fr.wikipedia.org/w/index.php?title=industrie_de_la_houille_blanchediff=143234225oldid=143234146', 'commentaire', 'création', 'article']['Borvan53', 'anonyme', 'anonyme', 'OT38', 'Binabik', 'Azurfrog']6['2017-12-06T15:56', None, None, '2017-12-06T16:48', '2017-12-07T20:57', '2017-12-12T09:18']201.001.2227846.3333332965635105.833333
1Ton de l'article11324890_29['déplacement', 'bandeau', 'ti', 'section', 'douteux', 'qualifierez', 'vous', 'trop', 'enjouer', 'promotionnel', 'passage', 'citer', 'haut', 'page', 'réponse', 'bonjour', 'bien', 'problème', 'source', 'unique', 'évite', 'difficilement', 'générer', 'manque', 'neutralité', 'ici', 'haut', 'affaire', 'page', 'promouvoir', 'thèse', 'Louis', 'Chabert', 'constitue', 'sorte', 'fiche', 'lecture', 'non', 'critique', 'faute', 'source', 'secondaire', 'analyser', 'évaluer', 'thèse', 'souligner', 'thèse', 'Louis', 'Chabert', 'largement', 'financer', 'Péchiney', 'Ugine', 'Kuhlmann', 'voir', 'lien', 'indiquer', 'haut', 'expliquer', 'tendance', 'article', 'favoriser', 'touche', 'activité', 'groupe', 'très', 'présenter', 'article', 'conclusion', 'manque', 'neutralité', 'doute', 'problème', 'essentiel', 'article', 'créer', 'Bonjour', 'oui', 'dsl', 'on', 'demander', 'suppr', 'article', 'y', 'amélioration', 'bout', 'temps', 'coopération', 'Louis', 'Chabert', 'lacune', 'voir', 'état', 'paf', 'aboutir', 'argument', 'd', 'rrr_utilisateur', 'chabert', 'louis_rrr', 'chabert', 'Louis', 'actif', 'intervention', 'effectivement', 'nécessaire', 'assurer', 'conservation', 'article', 'bonjour', 'falloir', 'temps', 'lancer', 'procédure', 'suppression', 'rien', 'venir', 'falloir', 'remplacer', 'bandeau', 'info', 'cf.', 'https://fr.wikipedia.org/wiki/wikip%c3%a9dia:requ%c3%aate_aux_administrateursti_manifeste', '_', '1', 'immédiatement', 'création', 'bandeau', 'admis', 'poser', 'chabert', 'Louis', 'venir', 'réagir', 'rrr_utilisateur', 'chabert', 'louis_rrr', 'demande', 'temps', 'devoir', 'donner', 'article', 'potentiel', 'plaisir', 'lire']['OT38', 'OT38', 'Azurfrog', 'OT38', 'Azurfrog', 'Borvan53', 'Azurfrog', 'OT38', 'Borvan53']9['2017-12-12T09:52', None, '2017-12-12T10:04', '2017-12-12T10:15', None, '2017-12-12T12:15', '2017-12-12T12:37', '2017-12-12T12:52', '2017-12-12T23:21']003.003.0414516.11111116033136.777778
2Proposition de fusion entre [[Industrie de la houille blanche]] et [[Houille blanche]]11324890_38['discussion', 'transférer', 'Wikipédia', 'page', 'fusionner', 'Bonjour', 'propose', 'retirer', 'jour', 'bandeau', 'fusion', 'archiver', 'pdd', 'respectif', 'procédure', 'ad', 'hoc', 'oui', 'RRR_Utilisateur', 'ot38_rrr', 'OT38', 'fusion', 'finalement', 'procédure', 'approprié', 'discussion', 'mérite', 'conserver', 'falloir', 'statuer', 'nuée', 'bandeau', 'copivio', 'admissibilité', 'etc.', 'historique', 'auteur', 'article', 'également', 'auteur', 'thèse', 'probable', 'thèse', '559', 'page', 'résume', 'grâce', 'copier', 'coller', 'bref', 'admissibilité', 'discute', 'pdd', 'fusion', 'contenu', 'évidence', 'gêne', 'beaucoup', 'article', 'industrie', 'houille', 'blanc', 'état', 'actuel', 'devoir', 'titrer', 'houille', 'blanc', 'Maurienne', 'coup', 'voir', 'bien', 'fusionner', 'article', 'appuyer', 'ailleurs', 'unique', 'source', 'contraire', 'demande', 'Wikipédia', 'risque', 'déséquilibrer', 'totalement', 'article', 'houille', 'blanc', 'vocation', 'traiter', 'industrie', 'houille', 'blanc', 'totalité', 'monde', 'Maurienne', 'France', 'Union', 'européen', 'monde', 'entier', 'moment', 'guère', 'ébauche', 'article', 'industrie', 'houille', 'blanc', 'Maurienne', 'commencer', 'démontrer', 'admissibilité', 'éventuel', 'renommage', 'résolution', 'problème', 'vouloir', 'insister', 'lancer', 'procédure', 'suppression', 'mieux', 'clôturer', 'procédure', 'lancer', 'réagir']['anonyme', 'OT38', 'Borvan53', 'Borvan53', 'Azurfrog', 'Azurfrog', 'OT38', 'Nouill']8[None, '2017-12-18T14:45', '2017-12-19T10:30', '2017-12-06T15:44', '2017-12-12T09:30', '2017-12-12T08:57', '2017-12-12T09:03', '2017-12-23T21:26']101.751.6212515.62500005729336.625000
3mon article industries de la houille blanche en Maurienne11324890_41['ajouter', 'référence', 'bibliographique', 'choix', 'arbitraire', 'difficulté', 'préférer', 'reporter', 'place', 'référence', 'thèse', 'résulte', 'référencer', 'fois', 'devenue', 'inutile', 'papeterie', 'charge', 'supprimer', 'dernier', 'part', '23', 'souvenir', 'bien', 'proposer', 'face', 'texte', 'gauche', 'écran', 'traitement', 'cosmétique', 'propre', 'terme', 'devoir', 'je', 'soumettre', 'texte', 'traitement', 'affaire', 'suppose', 'avoir', 'revoir', 'ensemble', 'texte', 'esprit', 'fois', 'opération', 'terminer', 'rester', '-t', 'il', 'faire', 'mettre', 'article', 'conformité', 'code', 'wikipedia', 'souhaite', 'bien', 'sûr', 'voir', 'bout', 'aide', 'zzznote', 'type', 'unsigned_non', 'signé|chabert', 'Louis|27', 'janvier', '2018', '17:02', 'CET)|144917352|notif=']['bot']1['2018-01-27T17:02']010.000.017272.0000007272156156.000000
4articles enrichis11324890_51['enrichir', 'article', 'commune', 'Maurienne', 'orelle', 'saint-etienne-de-cuine', 'créer', 'nouveau', 'source', 'article', 'houille', 'blanche--']['CHABERT Louis']1['2018-02-02T22:11']001.001.011212.00000012122323.000000
5None6656380_11['avertissement', 'Homonymie', '|', 'revisionid=116606866', '|', 'tharsi', '--']['anonyme']1['2015-08-08T00:46']100.001.0177.0000007777.000000
6None8038650_11['Sourcer', 'jamais', 'nationalité', 'italien', 'malgré', 'mari', 'italien', 'enfant', 'italien', 'carrière', 'acteur', 'cinéma', 'théâtre', 'italien', 'sourcer', 'acquisition', 'nationalité', 'italien']['anonyme']1[None]100.001.011818.00000018183535.000000
7Revoil6358260_16['Bonjour', 'suite', 'recherche', 'généalogique', 'famille', 'possession', 'acte', 'mariage', 'peintre', 'Pierre', 'Revoil', 'naître', '12/06/1776', 'Lyon', 'directeur', 'beau', 'art', 'Lyon', 'épouser', 'Joséphine', 'Henriette', 'Révoil', 'nièce', 'mineure', 'Aix-en-Provence', '14', 'janvier', '1816', 'sœur', 'aîné', 'Louise', 'acte', 'disponible', 'ligne', 'site', 'archive', 'départemental', 'bouche', 'Rhône', 'précise', 'note', '1', 'père', 'Louise', 'frère', 'natif', 'Lyon', 'directeur', 'poste', 'Aix-en-Provence', 'épouser', 'Henriette', 'Leblanc', 'Servanne', 'fille', 'Jean', 'Baptiste', 'Benoit', 'Leblanc', 'Servanne', '1738', '1822', 'Marguerite', 'Rousseau', 'héritier', 'père', 'chateau', 'Servannes', 'situer', 'pied', 'Alpilles', 'campagne', 'village', 'Mouriès', '25', 'kilomètre', 'Arles', '60', 'kilomètre', 'Ouest', 'Aix', 'Bonjour', 'renseignement', 'très', 'intéressant', 'savoir', 'peut-être', 'principe', 'Wikipédia', 'repose', 'source', 'fiable', 'figure', 'article', 'absolument', 'publier', 'manière', 'fiable', 'ailleurs', 'savoir', 'vous', 'existe', 'ouvrage', 'reprendre', 'information', 'cordialement', 'Bonjour', 'principal', 'référence', 'Louise', 'Colet', 'Joseph', 'S.', 'Jackson', 'Louise', 'Colet', 'ami', 'littéraire', 'Yales', 'Romanic', '1937', 'information', 'publier', 'page', 'Pierre', 'Révoil', 'Louise', 'Colet', 'issu', 'archive', 'bdr', 'y', 'accès', 'direct', 'page', 'concerner', 'vouloir', 'vérifier', 'cliquer', 'http://doris.archives13.fr/dorisuec/jsp/system/win_main.jsp', 'choisir', 'Aix', 'registre', 'paroissial', 'état', 'civil', 'rechercher', 'nouveau', 'page', 'cliquer', 'mariage', '“', 'entrer', '1816', 'fois', 'case', 'nouveau', 'cliquer', 'bouton', 'rechercher', 'être', 'registre', 'vouloir', 'falloir', 'aller', 'page', '98', '99.--', 'oui', 'régulièrement', 'site', 'AD13', 'problème', 'source', 'dire', 'source', 'primaire', 'acceptable', 'Wikipédia', 'soumettre', 'analyse', 'critique', 'renvoyer', 'détail', 
'page', 'Wikipédia', 'source', 'primaire', 'secondaire', 'idéal', 'texte', 'publier', 'sujet', 'référence', 'avoir', 'citer', 'intéressant', 'cordialement', 'bonsoir', 'commentaire', 'source', 'primaire', 'secondaire', 'cas', 'venir', 'information', 'Louise', 'Colet', 'sœur', 'Pierre', 'Révoil', 'document', 'archive', 'disponible', 'belle-sœur', 'Pierre', 'Révoil', 'fille', 'benjamin', 'cousin', 'vérifier', 'apparaître', 'Joseph', 'S.', 'Jackson', 'Louise', 'Colet', 'ami', 'littéraire', 'Yale', 'Romanic', 'xv', '1937', 'courant', 'bien', 'cordialement', 'Nella', 'scheda', 'è', 'scritto', 'ch', 'figlia', 'non', 'è', 'stater', 'riconosciuter', 'marito', 'però', 'sul', 'sito', 'del', 'comune', 'di', 'Parigi', 'risulta', 'una', 'Colet', 'Henriette', 'Suzanne', 'nata', '16', 'luglio', '1840', 'quindi', 'meno', 'di', 'disconoscimento', 'successivo', 'di', 'paternità', 'bambina', 'è', 'stata', 'registrata', 'con', 'cognome', 'Colet', 'http://canadp-archivesenligne.paris.fr/archives_etat_civil/avant_1860_fichiers_etat_civil_reconstitue/fecr_visu_img.php?registre=v3e_n_0518type=ecrfbdd_en_cours=etat_civil_rec_fichiersvue_tranche_debut=ad075er_5mi20785_00604_cvue_tranche_fin=ad075er_5mi20785_00653_cref_histo=55684cote=v3e/n', '518']['Claude.martin', 'Malost', 'Claude.martin', 'Malost', 'Claude.martin', 'anonyme']6['2012-06-12T10:42', '2012-06-12T10:58', '2012-06-12T13:17', '2012-06-12T13:25', '2012-06-12T23:43', None]102.502.0327946.500000258151185.166667
8« de Servannes »6358260_21['Bonjour', 'quelle(s', 'source(s', 'provenir', 'Louise', 'naître', 'Révoil']['anonyme']1[None]100.001.0177.000000771212.000000
9None1855970_11['renommer', 'critère', 'métrisabilité']['Anne Bauval']1['2010-06-15T21:03']001.001.0133.0000003377.000000

Last rows

title id_thread n_posts tokens interlocutors n_interlocutors dates n_anonymes n_bots mean_post_per_interlocutor mean_post_per_interlocutor_with_anonymous max_post_per_interlocutor_with_anonymous n_tokens mean_tokens min_tokens max_tokens n_tokens_stopwords mean_tokens_stopwords
2864[[gothisme]] et [[mouvement gothique]]4884870_59['page', 'gothisme', 'créer', 'récemment', 'sujet', 'controverse', 'page', 'discussion', 'création', 'discuter', 'Gothisme', 'peut-être', 'discussion', 'ici', 'avis', 'extérieur', 'aboutir', 'solution', 'agir', 'bien', 'jargon', 'rien', 'fusion', 'mouvement', 'gothique', 'moindre', 'mal', 'avoir', 'préconiser', 'suppression', 'pur', 'simple', 'remarqu', 'concret', 'perdre', 'temps', 'propose', 'absence', 'opposition', 'appuyer', 'référence', 'redirect', 'mettre', 'place', 'ici', 'semaine', 'fusion', 'effectuer', 'jour']['Sand', 'Darkline', 'Crobard', 'Agarwaen', 'GL', 'Case', 'Enkahel', 'GL', 'Sand']9['2005-11-01T08:30', '2005-11-01T09:05', '2005-11-01T10:41', '2005-11-01T11:36', '2005-11-01T11:55', '2005-11-01T12:11', '2005-11-01T13:32', '2005-11-01T11:55', '2005-11-07T07:41']001.2857141.2857142495.44444401810111.222222
2865None1250500_11['falloir', 'minimum', 'mettre', 'titre', 'tableau', 'présentable', 'savoir', 'agir', 'savoir', 'nommer', 'colonne']['Isaac Sanolnacov']1['2007-01-03T13:00']001.0000001.00000011111.00000011113434.000000
2866Janine / Jeannine ?1898590_11['regardez', 'page', 'http://biographiesartistesquebecois.com/artiste-b/bergeronjano/bergeronjano.html', 'vrai', 'nom', 'épelle', 'Jeannine', 'Janine', 'savoir', 'vous', 'orthographe', 'correct', '-andy']['217.50.59.156']1['2010-12-17T17:41']001.0000001.00000011313.00000013132525.000000
2867None1892380_11['Bonjour', 'indiquer', 'article', 'figure', 'contre', 'aide', 'mieux', 'comprendre', 'jouer', 'paramètre', 'trouve', 'figure', 'article']['anonyme']1[None]100.0000001.00000011313.00000013133030.000000
2868Discordances Wikidata10838710_12['4', 'mai', '2017', 'discordance', 'remarquer', 'donnée', 'article', 'Wikidata', 'point', 'consulter', 'source', 'fiable', 'souhaitable', 'harmoniser', 'donnée', 'corrigeant', 'sourcer', 'article', 'corrigeant', 'sourcer', 'Wikidata', 'expliquer', 'raison', 'divergence', 'travail', 'effectuer', 'catégorie', 'article', 'information', 'diffèrent', 'Wikidata', 'pouvoir', 'enlever', 'catégorie', 'article', 'information', 'diffèrent', 'Wikidata']['anonyme', 'anonyme']2['2017-05-04T15:23', None]200.0000002.00000023819.0000005338442.000000
2869None9334600_11['avertissement', 'Homonymie', '|', 'revisionid=117142228', '|', 'Pédicelles']['anonyme']1['2015-08-06T22:26']100.0000001.000000166.0000006666.000000
2870None3136640_11['traduire', 'page', 'permettre', 'modifier', 'organisation', 'page', 'original', 'mieux', 'séparer', 'texte', 'incertitude', 'rapport', 'mot', 'aide', 'bienvenue', 'voir', 'page', 'suivi', 'traduction', 'détail']['Eldorino']1['2008-07-17T00:17']001.0000001.00000012020.00000020205656.000000
2871Gospel295640_13['version', 'anglais', 'donne', 'read', 'fiery', 'gospel', 'writ', 'burnished', 'steel', 'traduction', 'français', 'lire', 'ardent', 'texte', 'gospel', 'écrire', 'lisse', 'ligne', 'acier', 'traduire', 'lire', 'ardent', 'texte', 'évangile', 'écrire', 'lisse', 'ligne', 'acier', 'sembler', 'logique', 'non', 'oui', 'mieux', 'traduction', 'partie', 'inspirer', 'http://64.233.167.104/search?q=cache:qWacnGEXtMMJ:laurentmalet.free.fr/mariage/Ceremonie.html++%22J%27ai+lu+un+ardent+texte+de+gospel+%22hl=frlr=lang_fr', 'ici', 'falloir', 'mentionner', 'gospel', 'traduire', 'évangile', 'convier', 'modifier', 'retraduire', 'texte', 'entier', 'y', 'chose', 'modifier', 'vocabulaire', 'grammaire', 'expression', 'couplet', 'faire', 'lecteur', 'mmf--']['Revas', 'ADM', '83.158.47.186']3['2005-06-18T08:21', None, '2011-03-09T11:37']001.0000001.00000015819.333333133111939.666667
2872Trois Milliards de gens sur terre295640_21['Mireille', 'Mathieu', 'performed', 'melodie', 'Wikipedia']['anonyme']1[None]100.0000001.000000155.0000005566.000000
2873Reprise dans la fiction295640_31['morceau', 'apparaître', 'jeu', 'vidéo', 'fallout', '3', 'musique', 'diffuser', 'radio', 'Enclave', 'descendre', 'gouvernement', 'officiel', 'américain']['anonyme']1[None]100.0000001.00000011414.00000014142525.000000